This listing of claims will replace all prior versions, and listings, of claims in the application.
Listing of Claims: 

Claims 1-17. (Canceled). 

18. (Currently Amended) A method of controlling a system to optimize an 
objective function thereof, the system being capable of performing a plurality of candidate 
actions and being capable of monitoring response performances of a performance of a 
respective candidate action, the method comprising the steps of: 

a) monitoring response performance of a respective candidate action that is chosen to
be performed by the system;

b) storing, according to candidate action performed by the system, a representation of
said monitored response performance;

c) choosing which of the plurality of candidate actions is next performed by the
system so as to optimize said objective function by assessing, using the probability
distribution of the response performance of all of said plurality of candidate actions, which
candidate action is estimated to result in the lowest expected growth in regret after the chosen
candidate action is performed by the system;

d) commanding the system to perform the candidate action identified to be the next
performed in step c); and

e) repeating steps a) to d) to control the system so as to substantially optimize
the objective function of the system;

where regret is a term that represents a system performance measure that considers 
the relative merit of exploration of one or more apparently non-best candidate actions, with 
respect to the relative merit of exploiting what appears to be the current best candidate action 
based on historical response performances to date. 
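
As a non-limiting illustration of steps a) to e), the following minimal Python sketch shows one way such a control loop might be arranged. The claim does not fix a formula for the expected growth in regret, so step c) is stood in for here by Thompson sampling, a well-known technique that is not recited in the claims. The names `ActionStats`, `choose_action`, `control_loop`, `perform` and `prior_sd` are hypothetical, and a normal approximation of each action's mean response is assumed.

```python
import random


class ActionStats:
    """Stored representation of the responses seen for one action (step b)."""

    def __init__(self):
        self.n = 0        # number of monitored responses
        self.mean = 0.0   # running mean response
        self.m2 = 0.0     # running sum of squared deviations (Welford)

    def record(self, response):
        self.n += 1
        delta = response - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (response - self.mean)

    def plausible_mean(self, rng, prior_sd=1.0):
        # Draw one plausible true mean; prior_sd is an assumed spread used
        # until at least two responses are available.
        if self.n == 0:
            return rng.gauss(0.0, prior_sd)
        sd = (self.m2 / (self.n - 1)) ** 0.5 if self.n > 1 else prior_sd
        return rng.gauss(self.mean, sd / self.n ** 0.5)


def choose_action(stats, rng):
    """Step c stand-in (assumption): Thompson sampling over stored statistics."""
    draws = [s.plausible_mean(rng) for s in stats]
    return max(range(len(draws)), key=draws.__getitem__)


def control_loop(perform, n_actions, steps, seed=0):
    """Run steps a) to e); `perform` commands the system and returns its response."""
    rng = random.Random(seed)
    stats = [ActionStats() for _ in range(n_actions)]
    for _ in range(steps):                   # step e: repeat
        action = choose_action(stats, rng)   # step c: choose next action
        response = perform(action)           # steps d and a: command, then monitor
        stats[action].record(response)       # step b: store per action
    return stats
```

In this sketch `perform` is whatever callable commands the controlled system and reports the monitored response; any other step c) assessment could be substituted for `choose_action` without changing the loop.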

19. (Previously presented) A method according to claim 18 wherein step c) 
includes assessing which candidate action is likely to result in the lowest expected growth in 
regret on the basis of a true best candidate action which has the mean of said probability 
distribution. 

20. (Previously presented) A method according to claim 18 wherein step c)
includes evaluating the cost or losses associated with presenting a lower performing 
candidate action and the gain or benefit associated with knowing the true position of the 
current best observed candidate action on said probability distribution. 

21. (Previously presented) A method according to claim 20 wherein step c)
includes assessing which candidate action is likely to result in the lowest expected growth in 
regret according to an assumption that the current best observed candidate action is assumed 
to have zero uncertainty around its mean or expected response performance. 

22. (Previously presented) A method according to claim 18 wherein step c) 
includes assessing which candidate action is likely to result in the lowest expected growth in 
regret according to an assumption of a Student's distribution and evaluation of Student's t 
parameters as the basis for estimating probabilities of unequal or equal response states
between the candidate action with the current expected best response performance and any 
other candidate action. 
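
By way of non-limiting example, the Student's t parameters for comparing the current expected best candidate action with another candidate action might be evaluated as sketched below. The sketch assumes Welch's unequal-variance form of the statistic, which the claim does not specify, and the function name `welch_t` and its arguments (sample mean, sample variance and count for each action) are hypothetical.

```python
from math import sqrt


def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Return (t statistic, approximate degrees of freedom) for A versus B."""
    se2_a = var_a / n_a   # squared standard error of A's mean
    se2_b = var_b / n_b   # squared standard error of B's mean
    t = (mean_a - mean_b) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; needs at least two samples per action.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df
```

The resulting pair can then be passed to a Student's t distribution function from a statistics library to estimate the probability that the two response states differ.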

23. (Currently Amended) A method according to claim 18 wherein step c) 
includes using a Monte Carlo algorithm to provide understanding of the probability 
distribution of the response performance of all of the plurality of candidate actions and either 
choosing the candidate action that if not taken would contribute most to an expected regret
estimate, or choosing a candidate action with probability proportional to its contribution to
the expected regret estimate if not taken.
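
The two alternatives recited above may be illustrated, without limitation, by the following sketch. It assumes that each action's true mean response is approximately normal about its observed mean, and it takes an action's contribution to expected regret, if not taken, to be its average lead over the runner-up in the sampled worlds where it is best. That definition is an assumption made for illustration; the claim does not state one. The names `regret_if_not_taken` and `pick` are hypothetical, and at least two candidate actions are required.

```python
import random


def regret_if_not_taken(means, std_errs, rng, draws=2000):
    """Monte Carlo estimate of the regret each action would add if never taken."""
    regret = [0.0] * len(means)
    for _ in range(draws):
        sampled = [rng.gauss(m, s) for m, s in zip(means, std_errs)]
        best = max(range(len(sampled)), key=sampled.__getitem__)
        runner_up = max(v for i, v in enumerate(sampled) if i != best)
        # Skipping the sampled-best action forfeits its lead over the rest.
        regret[best] += (sampled[best] - runner_up) / draws
    return regret


def pick(regret, rng, proportional=False):
    """Choose by largest contribution, or with probability proportional to it."""
    actions = range(len(regret))
    if not proportional or sum(regret) <= 0.0:
        return max(actions, key=regret.__getitem__)
    return rng.choices(actions, weights=regret)[0]
```

With `proportional=False` the sketch follows the first alternative; with `proportional=True` it follows the second, so that less promising actions are still tried occasionally.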

24. (Currently Amended) A method according to claim 18 further comprising the
step of:

f) applying a temporal depreciation factor to the stored representations of the
response performance in order to depreciate the significance of the stored representations
over time.

25. (Currently Amended) A method according to claim 24 wherein step f)
includes applying, for each candidate action, a different temporal depreciation factor to the
stored representations of the response performance thereof.
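
A temporal depreciation factor of this kind might, for example, be realized as an exponential down-weighting of the stored statistics, as in the following non-limiting sketch. The class name `DepreciatedStats`, the choice of exponential weighting, and the convention that `depreciate` is called once per time step are assumptions made for illustration. Giving each candidate action its own instance, constructed with its own `factor`, corresponds to applying a different factor per action.

```python
class DepreciatedStats:
    """Per-action response statistics whose older entries count for less."""

    def __init__(self, factor):
        if not 0.0 < factor <= 1.0:
            raise ValueError("factor must lie in (0, 1]")
        self.factor = factor        # depreciation applied per time step
        self.weight = 0.0           # depreciated number of observations
        self.weighted_sum = 0.0     # depreciated sum of responses

    def depreciate(self):
        # Shrink the influence of everything stored so far.
        self.weight *= self.factor
        self.weighted_sum *= self.factor

    def record(self, response):
        self.weight += 1.0
        self.weighted_sum += response

    @property
    def mean(self):
        return self.weighted_sum / self.weight if self.weight > 0 else None
```

A factor of 1.0 leaves the stored representations undepreciated, while smaller factors make the mean track recent responses more closely.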

26. (Currently Amended) A method according to claim 18 further comprising the
step of:

f) forcing the performance of each candidate action a minimum number of times or
at a minimum rate.
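
The minimum-number alternative might be checked, purely by way of example, before the regret assessment of step c) is consulted. In the sketch below the function name `forced_action` and the parameter `min_count` are hypothetical, and only the minimum-number case is shown.

```python
def forced_action(counts, min_count):
    """Return the least-performed action still below min_count, else None."""
    action = min(range(len(counts)), key=counts.__getitem__)
    return action if counts[action] < min_count else None
```

A return value of `None` indicates that every candidate action has met the minimum, so the ordinary choice of step c) applies.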

27. (Previously presented) A method of controlling a system according to claim
18 wherein the system comprises a robot.

28. (Currently Amended) A method of controlling a system having two or more 
ranks of control arranged in a hierarchy, wherein each rank of control has a respective 
objective function and is capable of performing a plurality of candidate actions for that rank 
of control in the hierarchy, wherein the candidate action of a rank of control at the lowest 
level in the hierarchy represents the output candidate action selected to be performed by the 
system, and wherein the candidate action of a rank of control not at the
lowest level in the hierarchy represents the selection of a lower rank of control in the
hierarchy, and wherein the method by which each individual rank
of control operates is performed according to the method of claim 18.

29. (Currently Amended) A method according to claim 28 wherein
representations of the monitored response performances of the lowest level ranks of
control are all visible and accessible to the rank of control immediately above in the
hierarchy, for the purposes of appraising the probability distribution of the response
performance of all of said plurality of candidate actions.

30. (Previously presented) A method according to claim 18 wherein the
monitored response performance of a respective candidate action in step a) is stored in step b)
in a form to enable sharing of the stored representation of said monitored response
performance with another system.

31. (Currently Amended) A system having a control apparatus that is
programmed to control the objective function of the system according to the
method of claim 18.

32. (Currently Amended) A system according to
claim 31, where the system comprises a robot.

33. (Currently Amended) A control apparatus for controlling a system to optimize an
objective function thereof, the system being capable of performing a plurality of
candidate actions and being capable of
monitoring response performances of a performance of a respective candidate action, the 
control apparatus programmed to perform the steps of: 

a) monitoring response performance of a respective candidate action that is chosen to 
be performed by the system; 

b) storing, according to candidate action performed by the system, a representation of
said monitored response performance;

c) choosing which of the plurality of candidate actions is next performed by the 
system so as to optimize said objective function by assessing, using the probability 
distribution of the response performance of all of said plurality of candidate actions, which 
candidate action is estimated to result in the lowest expected growth in regret after the chosen 
candidate action is performed by the system;

d) commanding the system to perform the candidate action identified to be the next 
performed in step c); and

e) repeating steps a) to d) to control the system so as to substantially optimize the 
objective function of the system;

where regret is a term that represents a system performance measure that considers 
the relative merit of exploration of one or more apparently non-best candidate actions, with 
respect to the relative merit of exploiting what appears to be the current best candidate action
based on historical response performances to date.

34. (New) A method according to claim 18 wherein the representation of said 
monitored response performance contains at least one variable that characterizes the 
conditions under which the candidate action was performed. 